Git and GitHub

Imagine that you've just spent three weeks writing a crucial piece of software for your research and your laptop suddenly dies on you. Fortunately, you were savvy enough to make (occasional) backups but then you realize that they are all based on your old design and that none of them incorporate the radical revisioning you dreamt up two nights ago at 3 am and have spent the last 40 hours implementing, surviving only on adrenaline, cold pizza and thick coffee. It's OK, though, I'm sure that you'll remember all those changes...

Version control systems exist to manage software development, keeping track of (commented) changes you make to a code base and allowing you to backtrack, branch off, and refactor to your heart's content (assuming, of course, that you remember to keep the local copy synched with that in the master repository). There are many such systems out there, for example, Subversion (SVN), Mercurial, and Bazaar, all with their own advantages and idiosyncracies, but the one we will talk about here is Git. Although you can run your own local git repository (amd you may want to do so at your home institution), we're going to make use of GitHub, which is web-based (and hosted) repository that has a nice browser interface.

First, download and install Git for your operating system from here (assuming it's not already installed).

Second, register for an account on at GitHub - you may just want to have one GitHub account for your project so now is the time to choose whom is the most responsible member of your group and get them to sign up.

Once that is all done, you want to initialize git on your computer:

> git config --global user.name "Your Name Here"

> git config --global user.email "your_email@youremail.com"

The next part is a bit tricky and consists of generating and registering a SSH key with GitHub so that you are not constantly nagged for passwords every time you want to interact with it. If you were just going to be using git locally, this step would not be necessary (unless you're really concerned with security). The instructions for doing this step are here.

OK, now we are going to create the repositories for your code base (project): the master repository one at GitHub and the local one on your machine. First, go to GitHub (you may have to login) and click the tiny book icon next to your username (if all the icons look the same, go to the New Repository page. Give it a short memorable name, make it public (since we encourage open software), and click the green "Create Repository" button.

Although GitHub will serve as the master location for your code, the bulk of any development will happen on your machine and we mirror the repository you just created (let's assume you named it "MyProject") as a local directory:

> mkdir /some/directory/MyProject

Now you'll need to switch to this directory and initialize it for git:

> cd /some/directory/MyProject

> git init
Initialized empty Git repository in /some/directory/MyProject/.git/

And there you go, you're ready to start writing your code. Obviously at some point, you're going to want to sync the local repository with the master one (otherwise why are we bothering with this). This is a three-stage process (and if you can remember 'add-commit-push' then it is straightforward). Firstly, you need to tell git which files in your local directory you want in the master repository - you may decide initially that you want everything in it but you'll soon find that you end up with all sorts of extra stuff in the local repository that you don't need in the master, such as temporary files for testing, random notes, coffee shop receipts, etc. So for each file that you want to add:

> git add some_algorithm.py

Next you need to tell git to take a snapshot of the code you've just told it about. It's at this point that you also tell git something about this particular version of the code, for example, whether it's an initial version or which bug this change fixes or new feature it adds. These comments are visible on the public view of your code on the GitHub site so it's a good idea to be as informative as possible, otherwise:

As the cartoon shows, the command to do this is commit with a -m flag to add an inline comment. If you do not specify the flag then git will take you to a temporary file for you to add commit messages so it's easiest to just add them as inline text:

> git commit -m "Fixed the memory leak bug #15"

Finally, you need to push this all up to GitHub:

> git push

and the two repositories are now in sync (at least as far as the code you care about is concerned). If at any point you want to check which files are in the repository, which changes still need to be committed (and which branch of the repository you’re currently working on) then you can use:

> git status

Let's say now that you want a local copy of one of your colleague's codes (because they've implemented a fabulous algorithm that you want to steal also use) then you can check this out with:

> git checkout fastcode

assuming that it's the branch of your repository called "fastcode" and

>git clone http://github.com/otherProject.git

if it's in an entirely different repository (you can get the URL for this from the GitHub web page for this repository.

There are (a lot of) other git commands to manage branches, etc. and the GitHub repository has material on all of these but you always check out the syntax of a particular one locally with:

> git help command-name


In [ ]: